Linear Models for Panel Data
In this lecture we study linear models for panel data
Most (economic) research today uses panel data
Key advantage of panel data: several observations of potential outcomes per unit
Begin with the simplest possible panel setting with binary treatment:
Object of interest: “average effect of treatment”
Simplest approach: compute average change in \(Y_{it}\) across periods \[ \widehat{AE}_{ES} = \dfrac{1}{N}\sum_{i=1}^N (Y_{i2}- Y_{i1}). \tag{1}\]
Estimator (1) — simplest example of event study estimators (see Freyaldenhoven et al. 2021; Miller 2023).
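A minimal sketch of estimator (1) in Python (numpy assumed; the function name and toy data are illustrative):

```python
import numpy as np

def event_study_estimate(y1, y2):
    """Average change in outcomes between periods 1 and 2 (estimator (1))."""
    y1, y2 = np.asarray(y1), np.asarray(y2)
    return np.mean(y2 - y1)

# Toy data: outcomes for N = 4 units in periods 1 and 2
y1 = np.array([1.0, 2.0, 3.0, 4.0])
y2 = np.array([2.0, 2.5, 4.0, 4.5])
print(event_study_estimate(y1, y2))  # mean of (1.0, 0.5, 1.0, 0.5) = 0.75
```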
Possible empirical framework
Effect of interest: change in stock prices due to the announcement of the iPhone
Proposition 1 (Asymptotics for \(\widehat{AE}_{ES}\)) Let \((Y_{i1}, Y_{i2})_{i=1}^N\) be an i.i.d. sample with \(\E|Y_{i2} - Y_{i1}| < \infty\).
Then \[ \widehat{AE}_{ES} \xrightarrow{p} \E[Y_{i2} - Y_{i1}]. \]
Is \(\E[Y_{i2} - Y_{i1}]\) interesting (=causal)?
Need a causal framework to talk about causal effects!
Work in the familiar potential outcomes framework:
For brevity, write \(Y_{it}^d\), where \(d=0, 1\)
Potential and realized outcomes are connected as \[ Y_{i2} = Y_{i2}^1, \quad Y_{i1} = Y_{i1}^0. \]
It follows that \[ \widehat{AE}_{ES} \xrightarrow{p} \E[Y_{i2}^1- Y_{i1}^0]. \]
\(\E[Y_{i2}^1- Y_{i1}^0]\) is not necessarily a treatment effect: it mixes the effect of treatment with the effect of time!
Context Again consider the iPhone example. Then \(Y_{i2}^1- Y_{i2}^0\) is the effect of the announcement on prices, while \(Y_{i2}^0- Y_{i1}^0\) reflects how prices would have moved over time even without it.
We see a combination of both changes \[ Y_{i2} - Y_{i1} = Y_{i2}^1- Y_{i1}^0 = [Y_{i2}^1- Y_{i2}^0] + [Y_{i2}^0 - Y_{i1}^0] \]
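A small numerical illustration of this decomposition (numpy assumed; all numbers are invented): when untreated outcomes drift between periods, the event study estimate picks up the drift on top of the treatment effect.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 200_000

# Untreated potential outcomes with a deterministic time trend of +0.5
y1_0 = rng.normal(loc=1.0, scale=1.0, size=N)  # Y_{i1}^0
y2_0 = y1_0 + 0.5                              # Y_{i2}^0: time effect of 0.5
y2_1 = y2_0 + 2.0                              # Y_{i2}^1: treatment effect of 2.0

# We only observe Y_{i1}^0 in period 1 and Y_{i2}^1 in period 2
ae_es = np.mean(y2_1 - y1_0)
print(ae_es)  # ≈ 2.5 = treatment effect (2.0) + time effect (0.5)
```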
Simple solution: rule out changes over time
Assumption: no variation in potential outcomes \[ Y_{i2}^d= Y_{i1}^d, \quad d=0, 1 \]
Then \(\widehat{AE}_{ES}\) is estimating a causal parameter — average effects \[ \begin{aligned} \widehat{AE}_{ES} & \xrightarrow{p} \E[Y_{i1}^1- Y_{i1}^0] = \E[Y_{i2}^1- Y_{i2}^0] \end{aligned} \]
Time invariance is very strict. Why impose it if we only work with averages?
Weaker assumption:
Assumption (no trends):
\[
\E[Y_{i2}^d] = \E[Y_{i1}^d], \quad d=0, 1
\]
Allows random variation in potential outcomes across time periods
Proposition 2 (Causal asymptotics for \(\widehat{AE}_{ES}\)) Let the conditions of Proposition 1 and the no-trends assumption hold.
Then \(\widehat{AE}_{ES}\) is consistent for the causal parameter: \[ \widehat{AE}_{ES} \xrightarrow{p} \E[Y_{i1}^1- Y_{i1}^0] = \E[Y_{i2}^1- Y_{i2}^0] \]
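A quick simulation in the spirit of Proposition 2 (numpy assumed; the data-generating process is invented for illustration): each potential outcome has the same mean in both periods, outcomes still vary randomly across periods, and the estimator recovers the average effect.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 500_000
tau = 2.0  # true average treatment effect (illustrative)

# "No trends": E[Y_{i2}^d] = E[Y_{i1}^d], with fresh random draws each period
y1_0 = rng.normal(1.0, 1.0, N)                # Y_{i1}^0
y2_0 = rng.normal(1.0, 1.0, N)                # Y_{i2}^0, same mean as Y_{i1}^0
y2_1 = y2_0 + tau + rng.normal(0.0, 1.0, N)   # Y_{i2}^1, average effect tau

ae_es = np.mean(y2_1 - y1_0)  # observe Y_{i1}^0, then Y_{i2}^1
print(ae_es)  # close to tau = 2.0 for large N
```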
Can also connect \(\widehat{AE}_{ES}\) and OLS
Consider regression model \[ \begin{aligned} Y_{it} & = \beta_0 + \beta_1 D_{it} + u_{it}, \\ D_{it} & = \begin{cases} 1, & t= 2 \\ 0, & t =1 \end{cases} \end{aligned} \tag{2}\] where we simply treat \((Y_{i1}, D_{i1})\) and \((Y_{i2}, D_{i2})\) as separate observations
Proposition 3 (\(\widehat{AE}_{ES}\) is OLS) For the OLS estimator of \(\beta_1\) in regression (2), \[ \widehat{AE}_{ES} = \hat{\beta}_1^{OLS} \]
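A sketch verifying Proposition 3 numerically (numpy assumed; simulated data): stack the two periods as \(2N\) observations, run OLS of \(Y\) on an intercept and the period dummy, and the slope coincides with the average change.

```python
import numpy as np

rng = np.random.default_rng(2)
N = 1_000
y1 = rng.normal(size=N)                 # period-1 outcomes (simulated)
y2 = y1 + 1.5 + rng.normal(size=N)      # period-2 outcomes (simulated)

# Stack (Y_{i1}, D=0) and (Y_{i2}, D=1) as 2N separate observations
y = np.concatenate([y1, y2])
d = np.concatenate([np.zeros(N), np.ones(N)])
X = np.column_stack([np.ones(2 * N), d])

beta = np.linalg.lstsq(X, y, rcond=None)[0]  # (beta_0, beta_1) by OLS
print(beta[1], np.mean(y2 - y1))  # identical up to numerical precision
```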
A way to think about regression in causal settings:
Write down the regression in terms of parameters of interest: e.g. let \[ \beta_0 = \E[Y_{i1}^0], \quad \beta_1 = \E[Y_{i2}^1- Y_{i2}^0] \]
Connect regression to potential outcomes: what is \(u_{it}\) in terms of potential outcomes?
Check properties of this \(u_{it}\). If \(u_{it}\) is “nice”, apply OLS (or another method)
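As an illustration of step 2 of this recipe for regression (2), take \(\beta_0 = \E[Y_{i1}^0]\) and \(\beta_1 = \E[Y_{i2}^1- Y_{i2}^0]\) and impose the no-trends assumption:
\[
\begin{aligned}
u_{i1} & = Y_{i1} - \beta_0 = Y_{i1}^0 - \E[Y_{i1}^0], \\
u_{i2} & = Y_{i2} - \beta_0 - \beta_1 = Y_{i2}^1 - \E[Y_{i1}^0] - \E[Y_{i2}^1- Y_{i2}^0] = Y_{i2}^1 - \E[Y_{i2}^1],
\end{aligned}
\]
where the last equality uses \(\E[Y_{i1}^0] = \E[Y_{i2}^0]\). Both error terms have mean zero, which is exactly the "niceness" that makes OLS consistent here.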